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ABSTRACT 



21cm observations have the potential to revolutionize our understanding of the 
high-redshift universe. Whilst extremely bright radio continuum foregrounds exist at 
these frequencies, their spectral smoothness can be exploited to allow efficient fore- 
ground subtraction. It is well-known that -regardless of other instrumental effects-this 
removes power on scales comparable to the survey bandwidth. We investigate associ- 
ated systematic biases. We show that removing line-of-sight fluctuations on large scales 
aliases into suppression of the 3D power spectrum across a broad range of scales. This 
bias can be dealt with by correctly marginalizing over small wavenumbers in the ID 
power spectrum; however, the unbiased estimator will have unavoidably larger vari- 
ance. We also show that Gaussian realizations of the power spectrum permit accurate 
and extremely rapid Monte-Carlo simulations for error analysis; repeated realizations 
of the fully non-Gaussian field are unnecessary. We perform Monte-Carlo maximum- 
likelihood simulations of foreground removal which yield unbiased, minimum variance 
estimates of the power spectrum in agreement with Fisher matrix estimates. Fore- 
ground removal also distorts the 21cm PDF, reducing the contrast between neutral 
and ionized regions, with potentially serious consequences for efforts to extract infor- 
mation from the PDF. We show that it is the subtraction of large-scales modes which 
is responsible for this distortion, and that it is less severe in the earlier stages of reion- 
ization. It can be reduced by using larger bandwidths. In the late stages of reionization, 
identification of the largest ionized regions (which consist of foreground emission only) 
provides calibration points which potentially allow recovery of large-scale modes. Fi- 
nally, we also show that: (i) the broad frequency response of synchrotron and free-free 
emission will smear out any features in the electron momentum distribution and en- 
sure spectrally smooth foregrounds; (ii) extragalactic radio recombination lines should 
be negligible foregrounds. 
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1 INTRODUCTION 

21cm observations have the potential to revolutionize our 
understandin g of the high-re d shift u niverse (for recent 
review s, see iFurlanetto et alj (120061 ): iMorales fc Wvithel 
(2009)). By examining the imprints of the first galaxies on 
the intergalactic medium, they are complementary to (and 
potentially much more powerful than) direct surveys for 
proto-galaxies via Lya and dropout techniques; furthermore, 
they can survey the universe on much larger scales. More- 
over, unlike Ly-series absorption studies of high-redshift 



quasars, they constitute a fully three-dimensional probe 
which is not subject to saturation effects, and require no 
background sources. A host of upcoming low-frequency in- 
terferometers aim to detect fluctuations in 21cm emission 
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from the Epoch of Reionization as a key science goal, includ- 
ing GMRT0 MWA0 LOFAR0 21CMA0 PAPERS SKA0. 

One of the most difficult aspects of high- z 21 cm mea- 
surements is the extreme brightness of the astronomical 
foregrounds relative to the cosmological signal (which has 
a brightness temperature 5T__, ~ 20 mK): Galactic syn- 
chrotron emission has (at best) T g a i ~ 200-1000 K at 
these frequencies (jShaver et alj |l999) Extragalactic fore- 
grounds, includi ng radio galaxies, free - free emitters, and 
galaxy clusters jDi Matteo et al.1 12002| ; lOh fc Mackl 120031 ; 
|Pi Matteo et alj 12004 )7 are also significantly brighter than 
the 21 cm signal. Although these can be removed, the 
residual noise is larg e and numerous prac tical diflicul- 
ties arise (see §9.3 of Furlanetto et al.l 120061 , and §3.4 of 
IMorales fc Wvithel [20091. Outstanding amongst these are 
terrestrial and satellite radio- frequency interference (RFI); 
ionospheric refraction and scintillation; and calibration er- 
rors in the direction-dependent gain and polarization re- 
sponse of the antennae. A particularly worrisome problem 
is the chromatic PSF of the instrument, which couples spa- 
tial fluctuations of spectrally smooth point sources into 
fluctuations in the frequency domain, an effect commonly 
dubb e d " mode- mixing" . St u dies of the latt e r jLiuetal ' 



20091 ; iBowman et al. I 120091 ; iLiu et al l l009bl ; iDatta et al 



20101 ) show that while in principle this can be cleaned, 
the required calibration accuracy is formidable. By con- 
trast, once the bright point sources are removed and the 
instrumental effects dealt with, removal of the spectrally 
smooth confused foreground of faint sources and Galac- 
tic emission is comparatively easy, and thought to be a 
solve d problem JZaldarriaea et ail 1 2004; Morale s fc Hewitt 



20041; ISantos et all 120051; iWang et al.l 120061 ; IMorales et al. 
2006al ; McQuinn et al.l 120061 ). In this paper, we focus only 



on this latter, comparatively 'easy' stage of continuum fore- 
ground removal. We show that there are systematic effects 
and biases-generally associated with the loss of large-scale 
power — which if not carefully taken into account, can lead 
to spurious results. 

The most widely discussed method for continuum fore- 
ground removal inv olves fitting and subtracting a smooth 
function to the data JWang et alj|2006l ; IMorales et ai1l2006bl ; 
iMcQuinn et al.1 120061 7 This "trend removal" has a rigor- 
ous statistical justication as a means of extracting a tiny 
fluctuating signal from a huge, slowly varying background 
(|Rvbicki fc Press! I1992T ). It corresponds to projecting out 
low-o rder, slowly- varying modes in the data ( McQuinn et al. 
2006), with a price: cleaning also attenuates the signal by 
removing its large-scale fluctuations. It therefore severely 
restricts the spatial dynamic range of 21 cm observatories. 
For instance, MWA measurements of the 21cm power spec- 
trum will be limited to roughly a deca de in scale, from 
fc ~ 0.1 - 1/iMpc" 1 (|Lidz et al.l |2008| V Recent simula- 
tion studies have confirmed the efficacy of trend removal 



dGleser et al ll20Qg| ; Ijelic et al l |200S| ; IBowman et"ai1 120081 ; 
lHarker et aljhoioh . and it is generally accepted as the best 
(and only mainstream) means of removing foregrounds. 



While the loss of large scale power is widely known and 
accepted as unavoidable, the systematic biases introduced 
by the process of foreground removal are much less well 
known. In this paper, we examine the consequences of re- 
moval of large-scale power in the line-of-sight direction in 
three key measures: (i) power spectrum. Foreground clean- 
ing is often incorrectly characterized as removing all modes 
with k < fcmin- In fact, it removes all line of sight modes with 
fcii < fc m j n . We find that the converse of well-know n alias- 
ing effects ( Kaiser fc Peac ock 1991; Lidz et al. 2006) implies 
that removing large-scale power in radial modes removes 
small-scale power in transverse modes, creating an unavoid- 
able bias across a range of scales (rather than just k < fc m i n ) 
in measurements of the 3D power spect rum. While this effect 
has been noticed b efore in simulations |Bowman et al .^20091 : 
lHarker et al.|[200^ ). we provide a quantitative explanation 
and calculation of this effect. We show that this bias can be 
accounted for by correctly marginalizing over the large-scale 
modes along the line of sight, but at the expense of increased 
sample variance. We furthermore show that purely Gaus- 
sian realizations of the recovered power spectrum provide an 
accurate and extremely rapid means of generating Monte- 
Carlo error estimates for the power spectrum: fully non- 
Gaussian realizations are not necessary, (ii) PDF. The effect 
of foreground cleaning on the PDF ha s not been explored , 
apart from its effect on the skewness (Ha rker et al.1 12009). 
We find that foreground cleaning causes unacceptable dis- 
tortion of the one-point function. The PDF has surprisingly 
rich inf ormation about t h e pro g ress and topology o f reion - 
ization (iFurlanetto et all (|2004h;IWyithe fc Moralesl (|200 it ); 
lHarker et al.l {[200$ ); [ichikawa et al.l (|201fj| ')'). but these will 
be obscured unless these artifacts can be removed, (iii) To- 
mography. Even the first generation of instruments will be 
able to image the very largest HII regions ( > 20 — 30 Mpc 
comoving), perhaps sourced by high-redshift quasars similar 
to those found by SDSS at z ~ 6. In principle, this might al- 
low us to measure the neutral fraction by simply measuring 
the temperature contrast 5Tb between the HII region and 
the rest of the survey area (which contains many unresolved 
HII regions). We find that while the relative shapes of HII 
regions are preserved by foreground cleaning, the tempera- 
ture contrast is signifi cantly distorted, bedeviling efforts to 
measure 5Tb (see also iGeil et al.l (|2008l )V 
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The outline of this paper is as follows. In g2J we describe 
our methodology for generating 21cm boxes, telescope noise, 
foregrounds, and removing foregrounds. In H3I4I5I we discuss 
the effects of foreground removal on the power spectrum, to- 
mography, and PDF respectively. In |6] we discuss possible 

improvements to foreground removal methods, and summa- 

Murchison Widcficld Array, http://www.haystack.mit.edu/ast/arrays/mwa/ , . ■ cpti t a i- m j- u j 

, ,. A h ^ 1 1 1 _ _______ ,__. __ i \ — — ■ '■ ria e con clusions m g7| In AppendixlAI we discuss bounds on 

the smoothness of the synchrotron and free-free continuum 
Probe — Ei<H. foregrounds. In Appendix [B] we discuss the integrated ex- 
tragalactic radio recombination line background, and show 
that it is unlikely to be a show-stopper. 



1 Giant Metrewave 
|http: //www. gmrt.ncra.tifr. res. in/| 



3 Low Frequency Array, http://www.lofar.org/ 

4 21 Centimeter Array, [http: //w eb. phys.cmu.edu/~past/ 

5 Precision Array to 
http: / /astro. berkeley.edu/~dbacker/eor/ 
b Square Kilometre Array, http: / /www. skatelescope.org/ 



Systematic Effects of Foreground Removal in 21cm Surveys of Reionization 3 



2 METHOD 

We now describe our simulations of 21cm signal, telescope 
noise, foreground, and the foreground removal process. It 
is important to note that this paper is squarely aimed 
at studying the biases introduced by subtraction of large- 
scale power which is ubiquitous to all foreground clean- 
ing techniques. We do not aim at any advances in the 
state-of-the-art with regards to simulation or the clean- 
ing process (except where it pertains to loss of large- 
scale modes; fj6}. Thus, our simulations and cleaning al- 
gorithm are relatively simple. We do not consider 'mode- 
mixing' and the conse q uences of bright source subtrac- 



tion <|Lhiet_aLJ 2009a 



Liu et al r2009bll ; iBowman et al.l 



l|2009h ; iDatta et all (|2010j)); our for eground model is much 
less sophisticated than others (e.g.. [jelic et al.l l|2008h ); we 
largely ignore instrumental effects, such as leakage of po- 
larized foregrou nds to the total signal (| Jelic et al.l |2010| ; 
iGeil et ai1l2010h ; and we do not examine the subtleties of 
vario us foreground cleaning algorithms (e.g., l|Harker et al.l 
2009)). Indeed, often to make clear the biases that cleaning 
introduces, we omit noise from the plot, since many clean- 
ing algorithms are linear processes (see discussion in £|2.4[) . 
At this point, there are many detailed, 'end-to-end' simula- 
tion pipelines which incorporate all of these effects, and we 
have nothing to add to that literature. Rather, we aim at a 
theoretical understanding of the impact of the loss of large- 
scale power on various statistics, which can be understood 
robustly, independent of these further complications. Whilst 
many of these effects can be derived analytically, it is more 
transparent to show their effects on simulated maps. 



2.1 The 21cm Signal 

To sim ulate 21cm signal, we use the se mi-numeric simulation 
DexM l|Mesinger fc Furlanettol [20071 ') . which can efficiently 
generate ionization maps of the IGM at high redshift far 
more rapidly (and for much larger boxes) than full numer- 
ical simulations, whilst showing excellent agreement with 
N-body simulations with full radiative transfeiQ. To briefly 
summarize their algorithm: 3D realizations of linear density 
and veloci ty fields are f i rst ge n erated. Then, an excu rsion-set 
approach (Bond et all (|l99ll ). lLacev fc Cole! |l993|)) is used 
to filter the halos. The locations of the halos are adjusted us- 
ing their first-order displacements obtained from the linear 
velocity field. Finally, a similar filtering procedure is used 
to obtain an ionization field. Unlike the halos, the bubbles 
are allowed to overlap and the excursion set barrier depends 
on ionization efficiency. The 21cm brightness temperature is 
then computed as: 

T h {y) = ^-^(l-e--o) 



9(1 + 2) 1/2 :thi(1 + M 



H 



d,Vr/dr + H 



mK 



(1) 



where Ts is the gas spin temperature, t„ is the optical depth 
at the 21-cm frequency uo, S n i is the physical overdensity, H 
is the Hubble parameter, dv r /dr is the comoving gradient of 



the line of sight component of the comoving velocity, and all 
the quantities are evaluated at redshift z—uq/u — 1, under 
the approximations that Ts » T 7 and dv r /dr <C H (i.e., 
redshift space distortions are ignored). Our fiducial (200) 3 
pixel box-kindly provided to us by the authors — is 100 Mpc 
on a side (resulting in a cell size of 0.5 Mpc) at z — 8 and 
a neutral fraction xm — 0.56. The bubble size d istrib ution 
can be seen in Fig 7. of iMesinger fc Furlanettol l|2007ft and 
is peaked at ~ 8 Mpc, with a significant tail out to ~ 30 
Mpc. 

We also generate 3D realizations of this box which have 
the same power spectrum but are Gaussian random fields. 
These realizations do not contain ionized bubbles and have 
a different probability distribution function than the 21cm 
box; in particular there is no delta function distribution 
of fully ionized pixels. We found that the foreground sub- 
traction affected the spherically averaged power spectrum 
and the average line of sight power spectrum similarly for 
both the initial 21cm box and the gaussian realization boxes. 
Hence, we were able to use the realization boxes for Monte- 
Carlo calculations where many realizations with accurate 
power spectra are required, rather than repeating the entire 
semi-numeric process described above many times. More de- 
tails of this are in 



2.2 Telescope Noise 



As explained in £)2.4I in the most widely used variants such 
as polynomial fitting, foreground cleaning is a linear oper- 
ation. Thus, the impact of foreground cleaning on signal, 
foregrounds and noise can be considered separately. For the 
sake of clarity, we often do not simulate telescope noise, in 
order to clearly see the systematic biases foreground cleaning 
creates in the signal. Below, we describe how the telescope 
noise is simulated when we do consider it, in H3.3I 

For the sake of definiteness, we consider parame t ers ap - 
propriate to MWA, as specified in IBowman et al l (|2009| y 
The MWA consists of iV a = 500 tiles, each with effective area 
A c w min(16, 16(A 2 /4m 2 )). We assume a smooth antenna 
density profile n a (r) tx r~ 2 within a 750m radius, with a core 
density of one tile per 36 m 2 . We assume the sky-dominated 
system temperature to be T sys ~ 440[(1 + z)/9] 2,6 K, and 
that a single field of view is observed for t = lOOOhr (this 
is somewhat optimistic, but does not affect our basic con- 
clusions). The baseline number density is then: 

n b {U, u) = C b /' m * X 27rrdrn a (r) [** d0n a (|r - AU|), (2) 
Jo Jo 

where C b is a frequency dependent normalization (since U oc 
v) such that / dUn b {U,v) = N a (N a - l)/2. The rms noise 
per visibility per frequency channel is then: 



AVk (U, v) = 



A 2 T S , 



:K. 



(3) 



where 8v is the frequency bin width, and tu is the integra- 
tion time for that visibility. The average i ntegration time 
that an array observes the visibility U is ( Mc Quinn et al.l 
l2006h : 



7 Rece ntly, an even more r apid variant of this algorithm, 21cm- 
FAST (Mesing er et al . 2010), has been introduced, which achieves 
dramatic speed-ups by bypassing the halo finder. 



tu ~ -r^t n b (U, v). 
A^ 



(4) 



For each frequency channel, we draw complex visibilities 
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from Gaussian distributions with rms given by equation 
(3), and respecting V(U,v) = V*(— U, v). We then inverse 
Fourier transform each slice to obtain the noise cube. 



2.3 Foregrounds 

Foreground brightness temperature fluctuations will be 
larger than the cosmological 21-cm signal by several or- 
ders of magnitude. The three main sources of foreground 
contamination of the 21-cm signal are Galactic synchrotron 
(which comprises ~ 70%), extragal actic point sources (27%) 
and Galactic bremsstrahlung (1%) |Shaver et alJll999h . The 
frequency dependence of these foregrounds can be ap- 
proximated by power laws with a runn ing spectral index 
jShaver et al.ll 19991 ; ITegmark et alj|2000h . While the sum of 
power laws is not in general a power law, over a relatively 
narrow frequency range (such as that considered in this pa- 
per where Av>/v C 1), a Taylor expansion around a power 
law can be used to describe the spectral shape. We there- 
fore also approximate the sum of foregrounds as a power 
law with a running spectral index, and specialize to the case 
of Galactic synchrotron emission, which dominates the fore- 
groundfl For more detailed foreground models including 
synchrotron emission from discrete sources such as super- 
nova re mnants, and free- f ree emission from diffuse ionised 
gas, see I Jelic et all |2008h . We assume that the brightest 
point sources have been removed and only consider unpo- 
larised foregrounds. 

The intensity of the Galactic synchrotron emission 
varies as a function of both sky position and frequency. We 
model the frequency and angular dependence of galactic syn- 
chrotron foreground emission as follows. We first construct 
a realisation of the angular fluctuations in the foreground 
(around the mean brightness temperature) at a particular 
frequency fo using the relation 



2 Ci(vo) 
2tt 



2-P 



rr>o) 



(5) 



where the frequency dependence of the fluctuation ampli- 
tude is given by 



7? y >o) = ^ yr 



I'O 



150 MHz 



-a Hyn -Aa Byn log 10 [y /(150MHz)] 



(6) 



where a syn , Aa syn have mean values a s vn = 2.55, Aa avn 



0.1, P = 2.5 and A° yn = 25 K (shaver et all 1 19991 ; 
ITegmark et~aHl2000] ; IWang et al.ll2006h . The latter two val- 
ues are extrapolated from 30 GHz CMB observations. To 
model spatial variations, we draw the parameters a syn and 



8 The primary effect of other foregrounds will be to slightly alter 
the spectral index and the frequency dependence of the spectral 
index, and also (in the case of unresolved radio point sources) the 
angular structure of the foreground at small scales. Neither of 
these effects is important as our foreground removal technique is 
not sensitive to the specific value of the spectral index we adopt. 
As for angular fluctuations, these correspond to zero-point fluc- 
tuations along different lines of sight, which are immediately re- 
moved by foreground cleaning. We have experimented with in- 
creasing angular fluctuations by a factor of 10, and found little 
difference. 



Aa syn from a gaussian distribution with standard deviation 
~ 10% of the mean. 

We use the angular power spectrum to generate 
a two-dimensional realisation of brightness temperature 
fluctuations AT ayn (9) as a function of angular position 
6 to which we add a mean sky brightness T syn = 
165(^o/185 MHz) -2,6 K (although interferometers will gen- 
erally not be sensitive to the temperature zero-point). We 
then extend this foreground plane into three dimensions by 
extrapolating along each line of sight: 

/ \ -a syn -Aa Byn logi [i//i/ ] 

T syn {6, v) = [T syn + AT Byn (8)] I — J 

(7) 

Our results are not sensitive to the details of the assumed 
foreground model. For instance, an increase in the amplitude 
of foreground brightness temperature fluctuations, by an or- 
der of magnitude, results in no change to our results after 
foreground cleaning. The foreground completely swamps the 
HII region signal, which is not visible in the combined maps. 
The mean and standard deviation of the foreground bright- 
ness (at the central frequency 158 MHz of these cubes) are 
2.5 x 10 5 mK and 1.4 x 10 4 mK respectively, which should be 
compared to the ~ 10 mK 21-cm fluctuations. Note that our 
simulations exclude the mean sky brightness, which cannot 
be measured by an interferometer. 



2.4 Foreground Removal 

We employ here standard foreground cleaning tech- 
niques which hav e been promulgated i n the literature 
dWang et ail 120061 ; iMorales et ail l2006bl; [McQuinn et"afl 



20061 ; iGleser et alj 120081 ; iJelic et alj |2008| ; iBowman etafl 
2009): fitting a smooth function, such as a low-order poly- 



nomial, to the data, and subtracting it off, under the im- 
plicit assumption that the foreground along the line of 
sight varies much more slowly than the cosmological signal. 
There have been recent attempt s to develop non-parametric 
techniques l|Harker et alj [20091 1 . which show promise, but 
at present yield no decisive advantage, whilst consider- 
ably complicating error analysis. We therefore only dis- 
cuss standard parametric techniques. We also only con- 
sider foreground subtraction in image space, rather than 
Fourier space, since the two processes have been shown 
to yield fairly similar results (but note that such issues 
become more subtle once the fre quency depend e nce o f 
uv coverage is taken into account (|Bowman et al ] (|2009T ); 
iLiu et al. ( 2009a)l ; IlTu" et al (2009bl ). 

Let us write the measured brightness temperature fluc- 
tuation in a pixel as: 



x(6, v) = s(6, v) + n(6, v) + f(6, v) 



(8) 



where s, n, f is the contribution from cosmological signal, 
telescope noise, and foregrounds respectively. In principle, 
our knowledge that the Galactic foreground is approxi- 
mately a power-law might lead us to suppose that a fitting a 
log(z/)-log(a;) polynomial would be optimal, since that rep- 
resents a Taylor expansion about a power law. In practice, 
w e find that av — x polyn omial gives very similar results. As 
in lMcQuinn et aT] (|2006T l. we therefore choose to work with 
these functions, since this implies that foreground cleaning 
is a linear process. We use a Gram-Schmidt procedure to 
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form an orthogonal set of basis functions from the polyno- 
mial basis 1, x, ...x n . Along each line of sight, our best fit to 
the foreground is: 



x = Pq 



(9) 



where x = {xi} is an n-dimensional vector consisting of 
the number of fitted frequency bins, P is an n x m matrix 
composed of polynomials of order {0, 1, m} evaluated at 
frequencies {i}, and q is the m-dimensional vector of param- 
eters (polynomial coefficients) to be estimated. If we now 
minimise % 2 = ( x — x ) T (x — x), we obtain the simple linear 
estimator for the polynomial coefficient^]: 



q= (P T P) _1 x. 
The fit to the foreground is then: 

x = P(P T P) _1 x. 



(10) 



(11) 



which implies that the foreground cleaned data is: 

x = x-x= [l-P(P T P) _1 j x = nx, (12) 



i.e. foreground cleaning can be represented by a linear pro- 
jection operator II. In this case, x = s + n + f , and the 
covariance matrix of the cleaned data separates: 



(x T x) 



S + N + F. 



(13) 



This is an extremely convenient property we will exploit, 
since it means we can consider the impact of foreground 
cleaning on the boxes of the cosmological signal alone, 
without worrying about interaction with the noise or fore- 
grounds. We shall assume that the properties of the noise is 
fully known, so that N is fully specified a priori. 

For a linear foreground cleaning process such as equa- 
tion (|12|) . there are two possible sources of systematic biases. 
The first is that foreground cleaning is inefficient, and that 
the requirement that (ff ') <C (ss^) is not met; foregrounds 
residuals are still a significant contaminant. However, the ex- 
pected spectrally smoothness of foregrounds (see Appendix 
[S]and|B| makes this unlikely. For the foregrounds we simu- 
lated, the cleaning procedure always left foregrounds residu- 
als which were negligible compared to the signal. The second 
is that the process of cleaning itself, s — > s, introduces sys- 
tematic biases in the signal. The latter is more serious, and 
can bias astrophysical and cosmological inferences, unless it 
is correctly understood. 

Thus, we will generally illustrate the effects of fore- 
ground cleaning on the signal alone, without showing its 
impact on noise (note that foreground cleaning will have the 
same effect on noise as the signal) or foregrounds (which are 
always cleaned to a negligible level). We also do not depict 
the finite angular resolution of the telescope in our simula- 
tions. Whilst the effect of foreground cleaning remains the 
same (since it is a linear process), its impact is more diffi- 
cult to assess visually in smoothed maps, although it clearly 
emerges in statistical measures such as the power spectrum. 

9 Such a minimization implicitly assumes that the 'noise' to the 
fit— which is simply cosmological signal and telescope noise-is pro- 
portional to the identity matrix. This is approximately true for 
telescope noise-since different frequency slices are uncorrelated — 
but a poorer assumption for the cosmological signal. We defer 
further discussion of this issue to [[6] 
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Figure 1. Top panel: the ID foreground cleaning window func- 
tion, as given by equation 115B . The wavenumber corresponding 
to the box size is marked by the line labeled fc m i n . As the order of 
the polynomial increases, power at progressively smaller scales is 
subtracted from the box. Also shown is for a box which has 

a bandwidth 3 times larger. It is identical to the window function 
for the smaller box, except that k — > k/3. Bottom panel: the 3D 
foreground cleaning window function, as given by equation 1181 
Aliasing from the line of sight means that power is subtracted 
across a much wider range of scales. Otherwise, similar observa- 
tions apply. 



The effect of foreground cleaning and beam smoothing can 
be thought as of as a series of linear operators which alter 
the pristine signal; primarily on large scales and small scales 
respectively. Here, we focus on the former. 



3 EFFECT OF FOREGROUND REMOVAL ON 
THE POWER SPECTRUM 

3.1 Aliasing 

Survey geometry and foreground cleaning modify the ob- 
served power spectrum. A Fourier mode with wavevector 
k will be convolved with a window function W(k,k') = 
WgcomWf g _ rm , where W g00 m is the window function due 
to the finite survey geometry, and Wf g _ rm encapsulates the 
effects of foreground removal. For a survey with cylindrical 
geometry (L, R), where L is the depth of the survey (given 
by the bandwidth B of the survey) and radius R is given 
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by the field of view of the instrument, the survey window 
function is: 

W geom (q) = -^Jo(9«i/2)Ji(9rH) (14) 
q r H 

where q = k — k'. This expression reflects the coupling be- 
tween a wavevector k and all other wavevectors k' . In prac- 
tice, survey geometry plays a much smaller role in modifying 
the power spectrum than foreground cleaning. 

What is the effect of foreground cleaning on Fourier 
modes? Let us assume that foreground cleaning is done 
on a voxel by voxel basis in the frequency direction, as in 
equation (|12l) . i.e. its effect is equivalent to a linear projec- 
tion operator II operating in the z direction. This clean- 
ing process eliminates the orthonormality o f Fourier modes 
Uk^x exp(ifc z z). Following the notation of iMcQuinn et ahl 
(2006), the new Fourier basis is fit = Il/ife* , and the ID 
window function is simply: 

(k z ,k' z ) = / dzp,kxp>t' ■ (15) 
Jo 

We show this for foreground cleaning over an L=f00 cMpc 
interval at z~8 in the top panel of Fig. fT]), for polynomi- 
als of different order. Foreground cleaning causes a sharp 
reduction of power at scales comparable to the bandwidth 
over which the cleaning is done, fc m i n w 2n/L. As the or- 
der of the polynomial is increased, power on progressively 
smaller scales is removed. Also shown is W^, when fore- 
ground cleaning is performed over a bandwidth three times 
larger. It is identical to the window function for the narrower 
bandwidth, except that k — ¥ fc/3. Note that Wi D (k z ,k' z ) is 
actually a matrix; here we only depict the diagonal terms. In 
practice, the low sensitivity of first-generation experiments 
requires that the power spectrum can only be computed in 
a few large bins, and so the effect of mode-mode coupling is 
negligible. 

How is the 3D angle-averaged power spectrum affected 
by foreground cleaning? It i s worth reviewing some basics 
of power spectrum aliasing (|Kaiser fc Peacock! Il99ll ). The 
ID power spectrum is simply a projection of the 3D power 
spectrum: 



Increased Variance Due to Foreground Cleaning 



Pid(Moc J Pso(k)dk x dk y <x J P 3D (y / fc| + k 2 ± )k ± dk A 



Thus, we obtain: 



Aid(&z) oc k z J A 3D (fe)fe dk 



(16) 
(17) 



where 



A 2 D (fc z 



fc 3 P 3 D(fc)/(27T 2 ) 



AL(fc) = 



= kzPio(kz)/iT, and 
Thus, ID modes receive contributes 
from all wavenumbers with k fc z . So small fc z modes 
can receive contributions from modes with large k (which 
are oriented almost perpendicular to the line of sight). 
Conversely, power which is subtracted from small k z modes 
can affect the 3D power spectrum at large k. 

If we define = k z /k, then the window function for the 
3D angle-averaged power spectrum is simply given by: 



WSb{k) 



Wtu(k^) dfi. 



(18) 



This is shown in the bottom panel of Fig. (flj, for our box of 
100 cMpc (results corresponding to the flattened geometry 
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Figure 2. The increase in the 1 a error on a power spectrum 
measurement, when foreground cleaning is performed, relative 
to an ideal measurement without foregrounds). For increasingly 
agressive foreground cleaning (i.e., a higher order polynomial), 
the number of measurable modes falls, and the variance of the 
measurement increases. The wavenumber corresponding to the 
box size is marked by the line labeled k mln . Also shown is the 
increase in 1 a error for a box which has a bandwidth 3 times 
larger. 



of surveys will be shown in £J3.3[) . The scaling properties are 
very similar to those of W^, except that power is subtracted 
across a much wider range of scales, as expected from the 
discussion above. 

Thus, foreground cleaning causes suppression of power 
across a wide range of scales in the angle-averaged 3D 
power spectrum. We will see how this can be dealt with in 
£13.31 However, even if the systematic bias can be appropri- 
ately dealt with, the elimination of low-wavelength modes 
in the radial direction unavoidably reduces the number of 
modes available to measure the power spectrum at a given 
wavenumber. The fraction of measurable modes is given by 
Ws£>(k); hence, sample variance is increased by foreground 
cleaning by a factor l/ A /W 3 2 D (fc). We show this in Fig. ([2]). 
As expected, for increasingly agressive foreground clean- 
ing (i.e., a higher order polynomial), the number of mea- 
surable modes falls, and the variance of the measurement 
increases. Thus, in order to measure a given wavenumber 
k, one would have to perform foreground cleaning over a 
bandwidth corresponding to a significantly larger length- 
scale than L = 2-n/k. This increase in sample variance due 
to foregroun d cleaning is als o evid ent in the Fisher matrix 
formalism of IMcQuinn et all (|2006l "). 

How accurate are these analytic expectations? In Fig 
[3] we compare the power spectra predicted by these formu- 
lae (lines), to the power spectra measured directly from the 
box after foreground cleaning (points). For completeness, 
we also show the impact of foreground cleaning on the 2D 
power spectrum in the plane of the sky. The analytic and 
measured power spectra show good agreement. In particu- 
lar, whilst foreground subtraction only affects large scales in 
ID, aliasing results in broad suppression of power across a 
range of scales in 2D and 3D. 

Note that in making these comparisons, no foreground 
has been previously added to the 21cm box. If foregrounds 
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Figure 3. The ID, 2D and 3D power spectra after foreground 
cleaning by n = 2, n = 3 polynomials. The lines are the analytic 
calculation P 1D (k) = W? P 1D (k), P 3D {k) = W 2 D P 3D (k), where 
W^ D , Wg D is given by equation (1 1 5 I t , J 18D respectively. The points 
are the power spectra measured from the foreground subtracted 
box. The two show good agreement. Whilst foreground subtrac- 
tion only affects large scales in ID, aliasing results in broad sup- 
pression of power across a range of scales in 2D and 3D. 
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Figure 4. The effect of foreground cleaning on the power spec- 
trum of the foreground itself. Unlike for the signal (and noise) 
power spectra, fitting a smooth function such as a polynomial 
suppresses power on all scales for the smooth foreground. These 
effects also carry on to 3D (not shown). 



are initially added, the power spectrum for n = 3 polyno- 
mial subtraction is unchanged, but for n — 2 polynomial 
subtraction, the power spectrum blows up at large scales: 
foregrou nd cleaning is insuffic iently aggressive. As empha- 
sized bv lMcQuinn et all (|2006l ). one always chooses the low- 
est order polynomial that is able to remove foregrounds well 
below the cosmological signal. 

Given these considerations, one might be tempted to 
foreground clean by applying a sharp step function in k z 
space, rather than fitting a polynomial. The sharp decay 
in power at low wavenumber due to foreground cleaning by 
a polynomial might more simply be accomplished by ex- 
cluding all modes with fc z < fc z . cu t, where fc z , cu t is some 
critical wavenumber above which the foreground has little 
power. Indeed, we shall shortly see that marginalizing over 
modes with fc z < fc ZjCUt can eliminate systematic bias in 
power spectrum estimation. However, we should keep two 
facts in mind. Firstly, the (close to power-law) foreground 
power spectrum cannot be fully represented by a few Fourier 
modes in the line of sight; there is still power at high k z . 
Secondly, the effect of foreground cleaning can be repre- 
sented by a window function P c ican(fc) = W 2 (k)P(k) only 
for a field which is reasonably close to random phase; ap- 
plying a 'matched filter' such as a smooth polynomial to 
spectrally smooth foregrounds brings about a much greater 
reduction of power on all scales. These effects are shown in 
FigU Note that foreground cleaning can be represented as 
Jclcan (fc) = W 2 (k)P(k) for the noise, which is a Gaussian 
random-phase field. 



3.2 Insensitivity to Non-Gaussianity of Signal 

The best means of performing error analysis for the com- 
plex data analysis pipeline in 21cm experiments is to per- 
form Monte-Carlo simulations; in this way both systematic 
and statistical errors introduced at various stages can be 
fully understood. It is time-consuming to produce a new 
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simulation box of 21cm signal for each Monte-Carlo real- 
ization, as is necessary to take cosmic variance into ac- 
count. While extremely fast means of creating approximate 
21cm realizations for a given ionization parameter now exist 
|Mesinger et alj|20ich . these cannot be tuned to accurately 
represent different realizations of some specified underlying 
power spectrum, rather than the power spectrum of the par- 
ticular theoretical model used to generate the box. The easi- 
est way to go about this would be to simply create Gaussian 
realizations of a given power spectrum. 

However, it is not clear whether foreground cleaning will 
have the same impact on the power spectrum of a Gaussian 
and non- Gaussian field (such as a 21cm box). Least squares 
is an optimal means of fitting a parametric function when 
the residuals are Gaussian; however, it can be sub-optimal 
when residuals are non-Gaussian (for instance, when there 
are outlier data points which are highly unlikely in a Gaus- 
sian distribution; these will disproportionately affect the fit). 
In this case, the residuals, which constitute the 21cm signal 
itself, are non-Gaussian. In particular, there is strong phase 
coherence across ionized regions, and sharp features due to 
the boundary between ionized and neutral regions. 

In Fig [5] we compare the effect of foreground subtrac- 
tion on the power spectrum of the full 21cm box with non- 
Gaussian signal and on Gaussian realizations of this box, 
which have identical power spectra. For the Gaussian case 
the power spectrum is averaged over 10 realizations, to re- 
duce sample variance. Foreground subtraction has identical 
impact on the power spectra of the full 21cm box and Gaus- 
sian realizations, which implies that Gaussian realizations 
can be used for rapid Monte-Carlo simulations. The reason 
for this identical impact is likely because non-Gaussianity 
is important on scales smaller than those affected by fore- 
ground cleaning. This is also the reason why the analytic 
calculation P c lean(&) = Wi D (k)P(k), where W 3 2 D (fc) is given 
by equation (|18[l . closely corresponded to the box simula- 
tions in Fig [3] 



Comparing Initial Box and Gaussian Realization Box 



3.3 Eliminating Systematic Bias; Error Estimates 

As previously discussed, excising modes with k < fc z , cr it will 
have a similar effect on the signal and noise power spec- 
tra (though not on the foreground power spectrum) as fore- 
ground cleaning via fitting and subtracting a polynomial. In 
this case, Wi D will simply be a step function: W^n = for 
k ^ fc z , cr it, W\t> — 1 f° r k > fcz.crit- From equation (fT8|) . 
this implies that W| D (fc) w 1 — /Ucrit = (1 — fc z ,crit/fc). For a 
judiciously chosen fc cr it, this approximates the effect of poly- 
nomial fitting quite well. In FigO we show the effect of mode 
excision for modes with k z ^ 0.15 Mpc~ x on our 21cm box; 
from Fig [3l this is the wavenumber at which the ID power 
spectra starts plummeting rapidly after foreground cleaning, 
for an n = 3 polynomial. As can be clearly seen, a sharp k- 
space cutoff mimics the broad suppression of power in 2D 
and 3D extremely well. Thus, if after foreground cleaning 
we simply use uncontaminated modes with k > & z , cr it to es- 
timate the power spectrum (i.e., we marginalize over modes 
with k <, fc z ,crit), we should have an unbiased estimator of 
the power spectrum, even at low k. At the same time, as 
seen in Fig. [2] there will be unavoidably larger variance in 
the power spectrum estimate, since there are fewer indepen- 
dent modes sampling a given wavenumber. 
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Figure 5. The effect of foreground subtraction on the power spec- 
trum of the full 21cm box with non-Gaussian signal and on Gaus- 
sian realizations of this box, which have identical power spectra. 
For the Gaussian case the power spectrum is averaged over 10 re- 
alizations, to reduce sample variance. Foreground subtraction has 
identical impact on the power spectra of the full 21cm box and 
Gaussian realizations, which implies that Gaussian realizations 
are adequate for rapid Monte-Carlo simulations. 



We test these ideas out on a simulated field of view 
appropriate to the MWA, which is much more extended in 
the plane of the sky than in redshift space. We consider an 
MWA field which is ~ 800 deg 2 and ~ 6 MHz at z = 7.3; 
this corresponds to a rectangular box 4350 cMpc by 4350 
cMpc by 100 cMpc. To generate the simulated field, we 
use the fact that foreground cleaning is insensitive to the 
non-Gaussianity of the signal ( H3.2|) . and simply generate 
a Gaussian random realizatio n of a given pow er spectrum. 
We use the power spectrum of lLidz et al.l |2008r ) at z = 7.32, 
thi = 0.46, which is derived from radiative transfer simu- 
lations in a 130 Mpc h^ 1 box. MWA will only be sensitive 
to power spectrum measurements over roughly a decade in 
scale, k ~ 0.1 — 1/iMpc -1 . Measurements even over this 
limited wavenumber range are valuable, since the redshift 
evolution of the amplitude and slope of the pow er spectrum 
contains valuable information (|Lidz et al.ll2008h . The lower 
limit is given by foreground cleaning considerations. The 
upper limit is given by telescope resolution: on small scales, 
the telescope samples relatively few modes (mostly along the 
line of sight) due to finite angular resolution, and the noise 
blows up. Given these considerations, we divide our box into 
512 by 512 by 32 cells, which corresponds to cells of size 8.5 
by 8.5 by 3.1 cMpc. Finer resolution is unnecessary over this 
range of wavenumber. 

We divide the wavenumber range into 5 logarithmically 
spaced bins. In Fig.[6l we show how the systematic suppres- 
sion of power due to foreground cleaning can be correctly 
accounted for by marginalizing over the appropriate modes, 
i.e., only including modes with fc z > fc z , cr it in the calculation 
of the power spectrum. In the top panel, we marginalize 
over the first k z bin; much of the systematic bias is elimi- 
nated. In the lower panel, we marginalize over the first two 
fc z bins. Although all of the systematic bias is removed, the 
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Figure 6. The 2D and 3D power spectra after all modes with 
kz ^ 0.15Mpc _1 are set to zero; from Fig. [3] this is the wavenum- 
ber at which the ID power spectra starts plummeting rapidly af- 
ter foreground cleaning, for an n = 3 polynomial. Indeed, such 
a sharp k-space cutoff mimics the broad suppression of power in 
2D and 3D extremely well. 



power spectrum is measured over a somewhat smaller range 
of scales. 

Let us now run more detailed calculations to confirm 
that such a marginalization procedure is able to return unbi- 
ased power spectrum estimates with the minimum variance 
error bars expected from a Fisher matrix calculation. For 
the la tter, we use the expressions derived bv lMcQuinn et all 
(2006). For wavenumbers k z , k' z at which the foregrounds can 
be cleaned well below the signal, the Fisher matrix is: 



T 2 

1 N 



(/4f)(f f 



fj-k' 



\(T 2 N +f 2 ) 



(19) 



where w = X 2 B 2 /(A e D 2 AD), D is the comoving distance 
to the center of the survey at redshift z, AD is the comov- 
ing depth of the survey, and Tn = AVn(U,^) as given by 
equation ([3]). This expression assumes that foregrounds are 
cleaned well below the signal, and that detector noise dom- 
inates over the signal, which is true for the first generation 
of instruments. The variance in a power spectrum estimate 
for a single k mode, with line of sight component k z — fik, 
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Figure 7. Removing the systematic effects due to foreground 
cleaning by marginalizing over modes which have k z < fe z C rit- 
The calculation is performed for survey geometry appropriate to 
the MWA, in 5 logarithmically spaced bins, and only modes with 
kz > k z cl .; t arc included in the calculation of the power spectrum. 
In the top panel, we marginalize over the first k z bin; much of the 
systematic bias is eliminated. In the lower panel, we marginalize 
over the first two k z bins. Although all of the systematic bias is 
removed, the power spectrum is measured over a smaller range of 
scales. 



is then: 



al{k^) = (F k 



(20) 



Note that this effectively marginalizes over modes removed 
by foreground cleaning, since for such modes pby — > 0, 
and hence ap(k,fi) — > oo. The variance on the angle- 
averaged power spectrum over a spherical shell of logarith- 
mic width e = dink is then given by adding the error for 
individual fc-modes in inverse quadrature: 



Afj, 



4tj-2 



(21) 



where V B ui = D 2 AD(\ 2 /A e ) is the effective survey volume, 
and the sum over /j, is over all modes in the upper half plane 
permitted by survey volume. The lower bound is set by the 
survey depth, while the upper bound is set by the telescope 
resolution. In practice, we do not sum over all Fourier cells 
in an annulus of constant (fc,/i), but instead discretize k 
and physically count all modes within a given bin k — Ak < 
k| < k + Ak. This gives more accurate results in mode- 
counting, particularly at low wavenumber when the number 
of available modes is small. Note that the effective number of 
sampled modes is reduced at low k after foreground cleaning 
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(since modes with fc/i < fc z , C rit have <Tp(fc,/i) — ¥ oo), and so 
Op increases at low k, as was shown in Fig. [2] 

We compare these Fisher matrix estimates against 
Monte-Carlo simulations. We first generate a Gaussian re- 
alization of MWA 21cm signal boxes, as described above. 
We also add telescope noise, as described in ij2.2l and fore- 
grounds, as described in H2.3I We foreground clean with an 
n — 3 polynomial, and then perform a maximum likeli- 
hood estimate of the power spectrum Pi in 5 logarithmi- 
cally spaced bins from k = 0.1 to k = 0.7, the approximate 
range of wavenumbers accessible by MWA. The Gaussian 
likelihood is: 



C(Pi\d) = 



exp(id T C _1 d) 
(2 7 r) JV / 2 (detC) 1 /2- 



(22) 



where d = <5(k) is the data in Fourier space, and the covari- 
ance matrix C = S + N is assumed to be diagonal, where 
S(ki,kj) = P2icm(ki)5ij , and the noise covariance matrix is: 



In order to marginalize over modes which have been fore- 
ground cleaned, we set N;i to a very large number, for modes 
where fiki < fc z , m in. Since they are assigned a large variance, 
these modes are then given virtually no weight in the likeli- 
hood minimization. We set fc z , m m to the first bin of k z , as in 
the top panel of Fig(60- We perform the multi-dimensional 
minimization of the negative log likelihood, / = — 2bxC, via 
a Neder-Simplex algorithm. We perform 200 Monte-Carlo 
simulations, and compute the mean and variance of our de- 
rived maximum-likelihood estimates, comparing them to the 
true input power spectrum and the Fisher matrix estimates 
for the variance respectively. 

The results are shown in Fig. [8] The mean of the max- 
imum likelihood estimates for the power spectrum shows 
excellent agreement with the input power spectrum. This 
implies that our marginalization procedure enables unbiased 
estimates of the power spectrum, despite the fact that unfet- 
tered foreground cleaning suppresses power across a broad 
range of scales. The variance of the power spectrum esti- 
mates also shows good agreement with Fisher matrix esti- 
mates, implying that we have a minimal variance estima- 
tor of the power spectrum which respects the Cramer-Rao 
bound. Note that we plot an asymmetric error bar on the 
final data point to avoid an error bar depicting negative 
power. Overall, mode marginalization is able to robustly 
handle the broad suppression of power due to foreground 
cleaning, which should therefore not be a problem. 



4 EFFECTS OF FOREGROUND REMOVAL 
ON TOMOGRAPHY 

3D tomography is precluded for the noise levels and angular 
resolution of the first generation of instruments, except for 
perhaps the exceptionally large bubbles blown by quasars; 
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Figure 8. The input power spectrum with error bars computed 
from the Fisher matrix, compared against the derived mean and 
standard deviation of the maximum likelihood solution of 200 
Monte Carlo simulations of an MWA field. The two show ex- 
cellent agreement. In particular, the fact that the mean derived 
power spectrum equals the true input power spectrum implies 
that our marginalization procedure enables unbiased estimates 
of the power spectrum, despite foreground cleaning, whilst the 
close correspondence with Fisher matrix error bars implies that 
we have a minimum variance estimator. 



Slice in Plane of Sky 




Figure 9. Top panel shows the effect of foreground cleaning on a 
slice in the plane of the sky, while bottom panel shows the effect 
of foreground cleaning on a slice perpendicular to the plane of the 
sky. For both panels, left is true signal, center is n = 3 polynomial 
fit, right is n = 2 log fit. For a slice in the plane of the sky, the 
contrast between the bubbles and the neutral IGM appears to be 
reduced, although the topology of ionized regions is preserved. 
For a slice perpendicular to the plane of the sky, the reduction 
of contrast is also present, but to a lesser degree. Striae in the 
frequency direction become apparent. 



Note that using the first bin leaves us still with a very slight 
systematic bias, as is evident in the top panel of Fig[6] This is 
of course unnecessary and could have been avoided by a more 
judicious choice of fc z>m i n . 



Systematic Effects of Foreground Removal in 21cm Surveys of Reionization 11 



l|Geil fc Wvithell2008r i. Nonetheless, the origin of the sys- 
tematic artifacts introduced by foreground subtraction can 
perhaps best be understood in the image plane. We once 
again use the fact that polynomial fitting and subtraction 
can be represented by a linear projection operator f £|2.4[) . 
and we therefore consider the impact of foreground clean- 
ing on boxes of cosmological signal and foreground, without 
noise. Even in the presence of noise, the systematic artifacts 
due to the suppression of power in the cosmological signal 
will still be there (it will just be more difficult to pick out 
by eye). 

In Fig. [9l we show the effect of foreground cleaning on a 
slice of the box both parallel and perpendicular to the plane 
of the sky, for both a third order polynomial in v and a 
second order polynomial in log(i^) (we perform the latter to 
show that its effects are very similar to those of a polynomial 
in v. Of course, its use is deprecated since it cannot be rep- 
resented as a linear projection operator). For a slice in the 
plane of the sky, the contrast between the bubbles and the 
neutral IGM appears to be reduced, although the topology of 
ionized regions is preserved. This reduction of contrast due 
to foreground cleaning will bedevil attempts to measure the 
'step' in brightness temperature across an ionization front 
and hence potentially measure the neutral fraction, a fact 
which we previously pointed out in iGeil et all (|2008l ) . For 
a slice perpendicular to the plane of the sky, the reduction 
of contrast is also present, but to a somewhat lesser degree. 
Instead, striae in the frequency direction become apparent. 
These are obviously due to inaccuracies in the fit to the 
foreground in the frequency direction. 

These systematic artifacts are more apparent if we con- 
sider the residual errors in the foreground fit, or <5fg = 
fgtruc — fgflt , which we show in Fig. 1101 From the top panel, 
which shows residuals in the plane of the sky, we see a strik- 
ing result: foreground fitting errors are not random, but in- 
stead highly correlated with the topology of ionized regions. 
In fact, errors in foreground fitting trace out the distinctive 
structure of HII regions remarkably well. Note that <5fg rep- 
resents the difference between a (close to) power-law and a 
low order polynomial, so there is very little small scale power 
in the frequency direction, as can clearly be seen in the lower 
panel, which shows residuals perpendicular to the plane of 
the sky. However, closer examination of the bottom panel re- 
veals that errors in foreground subtraction along the line of 
sight in fact also correlate with large scale HII regions. The 
foreground fit is 'pulled down' or 'pushed up' by large scale 
HII regions: ionized pixels are a source of highly correlated 
noise, since they tend to cluster together, and the polyno- 
mial fit to the foreground responds by curving incorrectly. 
We show an example of this along one line of sight in Fig. 
1111 This small error in the foreground fit in the frequency 
direction occurs with sufficient consistency to produce the 
high correlated residuals we see in the plane of the sky in 
the top panel. Indeed, we see that that a substantial part 
of the signal has been subtracted off: the residuals show 
greater fidelity to the true signal (and the ionized regions 
have more correct contrast with the neutral regions) than 
the foreground subtracted box. 

There are two lessons for us from this. Firstly, it is the 
presence of large HII regions which cause curvature in the 
foreground fit. In a sense, this is unsurprising, since it is 
the large HII regions which induce large scale power. If HII 
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Figure 10. Error in foreground subtraction (<5fg = fgtrue — fgflt 
in the plane of the sky (top panel) and perpendicular to the panel 
of the sky (bottom panel). As in Fig. [9] left is true signal, center is 
n = 3 polynomial fit, right is n = 2 log fit. In the plane of the sky, 
the errors in foreground subtraction are highly correlated with HII 
regions. Perpendicular to the plane of the sky, long wavelength 
errors in foreground subtraction create striation in the frequency 
direction, which is also correlated with large HII regions. 
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Figure 11. The effect of foreground subtraction on a single line of 
sight. Errors in foreground subtraction are highly correlated with 
HII regions, which create downward curvature in the subtracted 
signal. 



regions were small compared to the bandwidth we fit over, 
then there will be little distortion due to foreground fitting. 
Thus, as reionization progresses, and the typical bubble size 
increases, we need to fit over progressively larger and larger 
bandwidths. We discuss this further in £ ]6.1I Secondly, if we 
could identify the largest HII regions and use them as cali- 
bration points (we know that there is no 21cm signal there, 
so after foreground subtraction 8Tt — 0; in particular, all 
HII regions should be renormalized to the same level after 
foreground subtraction) , this could greatly attenuate the dis- 
tortions introduced by foreground cleaning. We discuss this 
possibility further in in ^6.21 
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Figure 12. Left Panel: The original signal PDF, and the PDF 
after foreground subtraction with an n=3 polynomial. Note that 
the mean temperature (which is not observable by an interfer- 
ometer) has been subtracted from the signal PDF. Foreground 
cleaning considerable distortion of the PDF. Right panel: com- 
parison of the foreground- cleaned PDFs for two different sets of 
basis functions. Both introduce comparable amounts of distortion. 




Figure 13. Left: the original box of 21cm signal. Right: the 
scrambled box obtained by bootstrap resampling from the PDF 
of the original box. Such a box has an identical PDF, but a very 
different power spectrum. 



5 EFFECT OF FOREGROUND REMOVAL ON 
PDF 

In principle, the 21cm PDF — and in particular its redshift 
evolutio n — contains a wealth of information about reion- 
ization (Furlanetto et all |2004|; IWvithe fc Morales! 120071 ; 



lHarker et al.1 120091 : llchikawa et all 12010] '). In the left panel 
of Fig. 1121 we show our original signal PDF in blue, before 
foreground subtraction. A striking feature of the 21cm PDF 
is its bimodality: there is a delta function spike of ionized 
voxels, and a broad clump of neutral voxels, which is pri- 
marily due to density fluctuations. If one can measure the 
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Figure 14. Power spectrum of the original and scrambled 
bax, both before (solid) and after (dashed) foreground subtrac- 
n. The scrambled box has a 'white noise' power spectrum 
(fc) Pdconst=^ A 2 oc fc 3 , which implies that it has much less 
large scale power, and hence is much less affected by foreground 
removal. 
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Figure 15. Left panel: The original PDF of the scrambled box, 
and the PDF after foreground subtraction with an n = 3 polyno- 
mial. Note that there is considerable less distortion of the PDF, 
compared to Fig ll2l in particular, the distribution of neutral pix- 
els is almost perfectly preserved, whilst the distributed of ionized 
pixels is slightly broadened. Right panel: comparison of the fore- 
ground subtracted PDFs of the original (from Fig ll2[l and scram- 
bled boxes. The PDF of the scrambled box is significantly less 
distorted. 



relative number of voxels in each distribution, it might be 
possible to measure the ionized fraction of the IGM. By do- 
ing this as a function of redshift, it would then be possible 
to chart the progress of reionization. In practice, this signal 
PDF will be convolved with telescope noise as well as the ef- 
fects of finite telescope resolution (which then creates a long 
tail of partially ionized voxels). Since both of these effects 
are well understood, it is still possible to use maximum- 
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likel ihood mixture modelin g to extract the cosmological sig- 
nal jlchikawa et al.l (|2010l ); Oh et al, in preparation). This 
is despite substantial telescope noise; by eye, after the PDF 
appears perfectly Gaussian after it has been convolved with 
telescope noise (which is why we plot the noiseless distribu- 
tion here). But the huge number of voxels means that small 
deviations from Gaussianity are detectable and have strong 
statistical significance. 

However, all this assumes that there are no systematic 
(rather than statistical) sources of noise. No work to date 
has consider the effects of foreground subtraction on the 
21cm signal PDF; all have implicitly assumed perfect fore- 
ground subtraction. In fact, the distortions introduced by 
foreground cleaning are severe, and may preclude detail- 
ing modeling and statistical inferences from the measured 
PDF. The left panel of Fig. [12] shows the foreground sub- 
tracted signal PDF in green. The bimodality of the signal 
is clearly reduced, and the PDF is strongly distorted. In 
the right panel, we should that the distorted PDFs which 
emerge from cleaning with a third order polynomial in v or 
a second-order polynomial in logf are comparable. In retro- 
spect, these strong distortions could have been anticipated 
by the results of 33 where we saw the sharp reduction in 
contrast between neutral and ionized regions. 

One could imagine at least two possible sources of this 
distortion. One is that least-squares regression is an opti- 
mal foreground estimator when noise is Gaussian. In this 
case, the 'noise', or the 21cm signal we are trying to recover, 
is highly non-Gaussian, and perhaps least-squares regression 
breaks down (as it does when there are significant outliers in 
the data). Least-squares regression might be trying to find 
the solution which has maximally Gaussian residuals, and 
thus destroys the very non-Gaussian signature we seek. In 
this case, we would want to use robust regression, minimiz- 
ing some functional other than the sum of squares. Alter- 
natively, the distortion could be due to the removal of large 
scale power, as we have seen previously. This would require 
a different damage control strategy. 

To distinguish between these possibilities, we create a 
new signal box. We bootstrap resample (with replacement) 
from the original signal box, but impose no spatial correla- 
tions between voxels. The result is the box on the right panel 
of Fig. 1131 which clearly has no large scale features, but by 
construction the same PDF as the original signal box. As 
can be seen in Fig. 1141 the scrambled box has a white noise 
power spectrum, P(k) ~const=> A 2 oc fc 3 , which implies 
that it has much less large scale power, and hence the fluctu- 
ation spectrum is much less affected by foreground removal. 
Indeed, the rms temperature fluctuations (which are dom- 
inated by small scales) are scarcely affected by foreground 
removal, unlike the original signal box. The left panel of 
Fig. [15] compares the PDF of the original and foreground 
subtracted boxes; notice that the PDF is only slightly dis- 
torted. In particular, the distribution of neutral pixels is 
almost perfectly preserved, whilst the distributed of ionized 
pixels is only slightly broadened; the bimodality of the pixel 
distribution is still strongly apparent. In the right panel, 
we overlay the foreground subtracted PDFs of the original 
(from Fig|12[) and scrambled boxes. The PDF of the scram- 
bled box is significantly less distorted. This shows that the 
distortion of the PDF during foreground removal must be 
due to the suppression of large power, since only the orig- 
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Figure 16. Foreground subtraction in a 400 3 (cMpc) 3 box where 
^HI = 0-3- Here, the characteristic bubble size is much smaller 
than the comoving width of the slice, and hence there is little 
fluctuation power on the large scales over which foreground sub- 
traction suppresses power (top panel). Hence, as in the 'scrambled 
box' of J5] there comparatively little distortion of the 21cm PDF 
(bottom panel); in particular, the distribution of neutral pixels is 
almost perfectly preserved. 



inal box has significant large scale power. Non- Gaussianity 
must play only a small role, since both boxes have identical 
non-Gaussian PDFs. 

In summary, foreground removal causes significant dis- 
tortion of the PDF, and it is directly linked to the removal 
of large scale power from the signal. Any detailed modeling 
of the 21cm PDF must find a way to overcome this; oth- 
erwise, inferences about reionization will be systematically 
biased. Furthermore, the degree of distortion will be red- 
shift dependent, increasing as reionization progresses, and 
the characteristic bubbles size (and hence amount of large 
scale power) increases. 



6 POSSIBLE IMPROVEMENTS TO 

FOREGROUND REMOVAL METHODS 

We have seen that distortion of the measured power spec- 
trum can by treated by marginalizing over the appropriate 



modes with k z <, fc z 



we can then recover an unbiased 
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estimator of the power spectrum, albeit with increasing vari- 
ance at small wavenumber, as shown in Fig. [2] On the other 
hand, there is no analogous marginalization procedure which 
can recover the undistorted PDF and (in the future) 21cm 
tomographic images. Here, we suggest two ways in which 
these distortions can potentially be overcome, which can be 
applied at the early and late stages of reionization respec- 
tively. This is fertile ground for future work. 



6.1 Maximizing the ratio of Bandwidth to Bubble 
Size 

Figures [141 and [T5l show that if the signal has relatively little 
large scale power to begin with, then foreground subtraction 
will produce little distortion of the PDF. Large scale power is 
generally a function of HII bubble size; the larger the bubbles 
are, the more large scale power there will be. The character- 
istic bubble size increases continually throughout reioniza- 
tion; there is less large-scale power during the early stages 
of reionization. Thus, PDF distortion should be less severe 
earlier on. On the other hand, Fig. [1] shows that foreground 
cleaning suppresses power on the characteristic lengthscale 
of the frequency slice over which the cleaning is done; if we 
can clean over larger lengthsc ales, then the m o des w e are 
interested in may be preserved l|McQuinn et al.l (|2006T ) have 
also made the latter point). 

These considerations suggest that PDF distortion will 
be smaller if we maximize the ratio of bandwidth to bub- 
ble size. Thus, the earlier stages of reionization will be less 
susceptible to PDF distortion, and we should use as large a 
bandwidth as possible to do the cleanin j"1 . Fig. [16] shows 
the power spectrum and the PDF of a 400 3 (cMpc) 3 box at 
z = 8 with x m = 0.3 (note that out fiducial 100 3 (cMpc) 3 
box at x = 8 has xm = 0.56). As expected, the power spec- 
trum has little power on large scales; thus, there is indeed 
little distortion of the PDF due to foreground cleaning. We 
can likely measure the PDF fairly accurately during the ear- 
lier stages of reionization; however, during the later stages, 
we must use a different strategy. 
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6.2 Large HII regions as Foreground Calibrators 

In [21 we saw that large HII bubbles were the primary rea- 
son for errors in foreground fitting. The foreground fit is 
pulled down or pushed up by large scale HII regions: ion- 
ized pixels are a source of highly correlated noise, since they 
tend to cluster together, and the polynomial fit to the fore- 
ground responds by incorrect curvature. Examination of Fig. 
[TT] suggests a solution: if we could 'renormalize' the ionized 
HII regions all to the same level (since there is no 21cm sig- 
nal there), then the incorrect curvature would vanish, and 
the large scale power incorrectly subtracted from the box 
would be restored. In practice, this can be accomplished by 

Note that it is not optimal to analyze the PDF over large 
bandwidths when cosmic evolution can be important. Rather, one 
should clean the data over larger frequency slices, and then an- 
alyze it over narrower slices, within which the neutral fraction 
is relatively constant. Over a broader bandwidth, the foreground 
shows greater curvature, and eventually one cannot clean the fore- 
ground well below the signal with a given polynomial fit. 



Figure 17. Comparison of the customary foreground cleaning 
method ('Regular') and cleaning with an additional step when 
the positions of bubbles are known ('Optimal'), for our usual 
100 3 (cMpc) 3 box. Foregrounds are fit in the presence of noise, 
with S/N~ 1. Top panel: large scale power is correctly recovered 
with the optimal algorithm. The bias at small scales is due to 
errors in the foreground fit due to noise. Middle panel: the distor- 
tion of the PDF is far less severe with the optimal algorithm, since 
large scale power is correctly preserved. Bottom panel: a slice in 
the plane of the sky of the original box (left), regularly cleaned 
(middle), and optimally cleaned box (right). Large bubbles have 
all been set to the same zero-point in the right panel, compared 
to the varied contrast in the middle panel. 



a second stage after the initial round of foreground clean- 
ing: fitting a polynomial only to the voxels in ionized regions. 
Subtracting off this polynomial ensures that all HII regions 
will be 'renormalized' to the same zero-point, as required. 

Applying this to a noiseless box with foregrounds re- 
sults in perfect recovery of the original signal box! Of course, 
in a realistic scenario, there are two problems: (i) we will 
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not know a priori the positions of fully ionized voxels; fur- 
thermore, finite telescope resolution will 'mix' neutral and 
ionized voxels, producing voxels with effective partial ion- 
ization, and reducing the number of fully ionized calibration 
points, (ii) We fit to foregrounds in the presence of noise, and 
so even if we knew where all the ionized voxels were, they 
scatter about the true zero-point. This scatter produces fit- 
ting errors in the second cleaning stage. Regarding (i), note 
that even if direct imaging of bubbles is difficult in the first 
generation of instruments, it is possible to classify voxels 
as belonging to ionized or neutral regions in a probabilistic 
fashion, via mixture model or tesselation algorithms. Image 
segmentation and/or edge detection in the presence of noise 
is a classic image processing problem, and there are many 
possible methods (e.g., Canny al gorithm, Voron oi teselation, 
void-finding algorithms (e.g., see lColberd (|2008l )). etc). Note 
that the impact of both (i) and (ii) are reduced when bubbles 
are large, in the later stages of reionization: the HII regions 
are easier to detect, and also the noise is averaged over more 
voxels. Fortunately, this is the regime in which this second 
cleaning step becomes necessary, since large bubbles are the 
ones to destabilize the foreground fit. 

We leave detailed quantitative investigation of such a 
cleaning algorithm in a realistic setting in conjunction with 
an image segmentation algorithm to future work. Here, we 
simply illustrate its potential. We consider an idealized case 
where we know the positions of all ionized voxels (in fact, 
only knowing the voxels in large HII regions gives compa- 
rable results), and add Gaussian noise such that S/N ~ 1 
for each voxel; this is less noise than is realistic for a first- 
generation instrument. We then fit for foregrounds in the 
presence of noise, performing the second 'renormalization' 
step as described. We subtract the imperfectly fitted fore- 
ground from a box with signal and foreground only, so that 
any errors in foreground subtraction are evident. The re- 
sults are in Fig. 1171 We see from the top panel that the 
large scale power spectrum is correctly recovered with the 
optimal algorithm. Thus, using this cleaning procedure, we 
may not have to marginalize over modes with low k z , and 
pay the price in increased sample variance at low k. The bias 
at high k is due to errors in the foreground fit due to noise; 
in practice, these will be correctly handled when solving via 
a maximum likelihood scheme. In the middle panel, we see 
that the distortion of the PDF is far less severe with the op- 
timal algorithm, and the intrinsic bimodality more correctly 
preserved. In the bottom panel, we compare a slice in the 
plane of the sky of the original box (left), a box cleaned with 
the regular algorithm (middle), and a box cleaned with the 
optimal algorithm (right). Note that large bubbles have all 
been set to the same zero-point in the right panel, compared 
to the varied contrast in the middle panel. This directly re- 
sults in the reduced distortion of the PDF. 



7 CONCLUSIONS 

In this paper, we focus on the removal of large scale 
power along the line of sight by foreground cleaning al- 
gorithms for upcoming 21cm surveys of the high-redshift 
universe. Whilst the subtraction of large scale power is 
well-known, the systematic biases it induces in the cleaned 
data, and how this propagates into various statistics, has 



not been well-understood. We emphasize that there are 
many aspects of the foreground removal problem that we 
do not address. For instance, we do not consider 'mode- 
mixin g' and the conse q uences of bright s ource subtrac- 
tion jLiu et al. (2009a)l; [Liu et al (2009b )l ; iBowman et~ail 
(2009); Datta et al.l l|201ol )); our for eground model is much 
less sophisticated than others (e.g., I Jelic et aU (|2008)); we 
largely ignore instrumental effects, such as leakage of po- 
larized foregrou nds to the total signal (| Jelic et al.l l20ld ; 
iGeil et aljlaoich ; and we do not examine the subtleties of 
vario us foreground cleaning algorithms (e.g., (|Harker et al.l 
2009)). Such issues have been dealt with by other authors. 
However, the removal of large scale power along the line 
of sight is to date common to all foreground subtraction 
schemes. We specialize to the case of linear subtraction 
schemes, and use the fact that the impact of foreground 
cleaning on the signal can therefore be considered in isola- 
tion, without considering its interaction with noise. 
Our principal conclusions are as follows: 

• Removal of large scale power along the line of sight 
aliases into suppression of power across a broad range of 
scales in 3D, an effect which can be understood analytically. 
We compare analytic expectations to foreground cleaning 
simulations and find excellent agreement. This systematic 
underestimate of the true power spectrum can be correctly 
accounted for by marginalizing over all modes with k z ^ 
fc z ,crit- However, the reduction in the number of measurable 
modes at a given wavenumber unavoidably increases sample 
variance. 

• We show that foreground cleaning has the same effect 
on Gaussian realizations of the power spectrum as it does 
on fully non-Gaussian boxes. This allows for rapid Monte- 
Carlo simulations. We perform maximum-likelihood Monte- 
Carlo simulations of power spectrum estimation with real- 
istic noise, where we correctly marginalize over foreground 
cleaned modes, and show that we are able to recover unbi- 
ased power spectrum estimates, with the minimum variance 
given by Fisher matrix estimates. 

• To date there have been no studies of the impact of fore- 
ground cleaning on the PDF. We show that in fact significant 
distortion of the PDF occurs, which would compromise at- 
tempts to mine the 21cm PDF for astrophysical information. 
By comparing two boxes with the same non-Gaussian PDF 
but very different power spectra, we show that the distor- 
tion is due to the removal of large scale power, rather than 
errors in least-squares regression due to the non-Gaussian 
PDF. 

• Foreground cleaning will also distort tomographic im- 
ages, by reducing the contrast between neutral and ionized 
regions. However, the topology of ionized regions is pre- 
served. The reduction in contrast is because errors in fore- 
ground cleaning strongly correlate with the presence of ion- 
ized regions, which lead to artificial curvature in the fore- 
ground fit. 

• Whilst the effect of foreground cleaning on the power 
spectrum is easily dealt with, correcting distortions in the 
PDF and tomographic images is more difficult. We make two 
suggestions. The first is to foreground clean over the largest 
feasible bandwidth over which the effects of evolution can 
be neglected. Since foreground cleaning removes power on 
the scale of the cleaning bandwidth, if there is little power 
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on these scales, there will be little distortion of the PDF or 
reduction of contrast between neutral and ionized regions. 
Essentially, the slice width must be significantly larger than 
the characteristic bubble size. Early in reionization, this con- 
straint will be easy, but more difficult in the late stages, 
as bubbles grow. For these later stages, identification of the 
largest ionized regions (which consist of foreground emission 
only) provides calibration points which potentially allow re- 
covery of large-scale modes. Detailed exploration of this is 
an obvious avenue for future work. 

• In the Appendices, we put to rest two concerns which 
have been raised from time to time, but never been calcu- 
lated in detail. One is that because of the extremely bright 
nature of continuum foregrounds, even very slight deviations 
from spectral smoothness could introduce frequency struc- 
ture which would swamp the 21cm signal. We show that 
synchrotron and free-free emission for even a single elec- 
tron has such a broad frequency response that tempera- 
ture fluctuations over the small measured frequency inter- 
vals are negligible; averaging over a huge number of electrons 
of course damps any fluctuations even further. Another con- 
cern is that the integrated extragalactic radio recombination 
line background might constitute a formidable foreground. 
We show that even with aggressive estimates for stimulated 
emission from star-forming galaxies — where RRL luminos- 
ity is ~ 10% of the radio continuum luminosity-the RRL 
background is unlikely to prove a problem. The reason is 
the small comoving volume probed: whereas all continuum 
sources along a line of sight contribute, for a specific recom- 
bination line, RRL emitters from only a tiny redshift slice 
contribute. 
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APPENDIX A: FOREGROUND 
FLUCTUATIONS IN FREQUENCY SPACE 

The generic property that all foreground cleaning algorithms 
use is that foregrounds should be spectrally smooth, allow- 
ing the line features of 21cm emission to be picked out. 
However, given that foregrounds can exceed the signal by 
~ 4 — 5 orders of magnitude, it is not immediately obvi- 
ous that tiny spectral variations in the foreground will not 
mask the 21cm signal. In this section, we quantify the re- 
quired level of smoothness, and show that synchrotron and 
free-free emission from our Galaxy and extragalactic sources 
indeed satisfy it. 

If the foreground is a perfectly smooth power-law T(, oc 
u~ a , then the 21cm forest can be extracted, since it consists 
of narrow line features. However, there are inevitable varia- 
tions in the spectral index with position in the sky as well 
as frequency, and our ability to subtract the galactic syn- 
chrotron foreground is dominated by the uncertainty in the 
spectral index. This dispersion is of order Aa ~ 0.03 — 0.1 
at ~ 200 MHz (jLawson et all Il987l ; iBandav fc Wolfendald 
Il99lf ). The required smoothness of the foreground in fre- 
quency space is therefore often cha racterized by uncer - 



tainty in the spectral i ndex Aa (e.g.. IShaver et alj (|l999f ) 



iDi Matteo et al.l (|2002h ; iGnedin fc Shaver! l|2004h TThe tem- 



perature uncertainty due to an uncertainty in the spectral 



index Aa is (AT/T) » {vjv ) L 



Aa(Av/v). The 



21cm fluctuations occur over a frequency interval (Au/u) ~ 
rjon/fe ~ 10~ 2 , where ri on is the characteristic size of ion- 
ized/neutral patches and Ih is the Hubble length. These 
correspond to a variation in the spectral index of: 



Aa(reion) « 5 x 10 



AT/T b 
5 x 10- ; 



Sv jv 

lO- 2 



(Al) 



index observed in our Galaxy observed from point to point 
and over fairly large frequency intervals, 8v/u ~ few x 0.1, 
it is not clear that the spectral index can be constrained to 
the ~ 10 -3 accuracy required. 

How strong will the deviations from power law emis- 
sion be? Synchrotron emission is a power law because the 
electron momentum distribution is a power-law (which is 
generically true, for instance, for Fermi acceleration). An 
electron momentum distribution N(-y)d^y oc r y~ p d r y will pro- 
duce a synchrotron emission spectrum F v oc i/ - "" -1 " 2 . How- 
ever, the shorter lifetimes of high-momentum electrons will 
introduce curvature even for an initially perfect power law. 
Moreover, turbulence, non-linear scattering, magnetic field 
strength fluctuations and plasma effects could introduce 
fluctuations in the momentum distribution. Will this trans- 
late into sufficiently large spectral index fluctuations so as 
to swamp the 21cm signal? 

It is easiest to think directly in terms of temperature 
fluctuations, rather than the spectral index. In order for 
galactic foregrounds not be a significant source of contam- 
ination, we shall simply require that the rms temperature 
fluctuations smoothed on 8v ~ 0.1 — 1MHz intervals be sig- 
nificantly less than the expected 21cm signal: 



{ATf g ) 1/2 < lO^K. 



(A2) 



The observed emission as a function of frequency T(uj), 
is the convolution of the electron momentum distribution 
AT (7) with the emission spectrum of a single electron E(uj): 
T(cj) oc N(7)®E(u). Thus, by the convolution theorem, the 
power spectrum of temperature fluctuations as a function of 
smoothing frequency Aojk is given by: 



p(fe„) = p(k- i )wi 



(A3) 



where Wk is the Fourier transform of the emission spec- 
trum E(uj). The window function Wk therefore quantifies 
the suppression of small-scale temperature fluctuations due 
to smoothing by the broad emission kernel E(u>). 

Let us now calculate W k ■ The power per unit frequ ency 
emitted by each electron is l|Rvbicki fc Lightmadll979r ): 



. . V3q 3 Bsina 
kj(bj) = — t (x) 



(A4) 



where a is the pitch angle between the magnetic field and 
electron velocity, while 



F(x) = x / K 6/a Zdt, (A5) 

J X 

lodified Bessel function c 
and x = u)/uj c . The critical frequency ui c is: 



where K^/ 3 is a modified Bessel function of the second kind, 



_ 3 3 . 
uic = —7 <-ob sma 



3^ 2 qB sina 
2m e c 



(A6) 



Given the large Aa ~ 0.03 — 0.1 variations in the spectral 



where 7 is the electron Lorentz factor. Note that the criti- 
cal frequency is larger than the electron gyration frequency 
ljb = qB/jnieC by a factor ~ 7 s . The power per frequency 
interval dv is simply P(y) = 2ttP(lj). The function F(x) is 
shown as the top panel of Figure IA1I Note that it is broad 
in frequency range: Av ~ v. This implies that fluctuations 
on scales Av <C v will be smoothed out. 

The window function Wk is simply the absolute value 
of the Fourier transform of F(x). In the bottom panel of 
Figure ET1 we plot Wk as a function of Xk = 1/k ~ Avjv. 
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In the bottom panel of Figure I All we plot Wk , as well as 
the window function for the left and right portion separately. 
Note that F(x) is asymmetric about its peak at x max = 0.29 
and the left (low-frequency portion) is much narrower than 
the right (high-frequency) portion. The latter sets the char- 
acteristic frequency width of the electron's emission. Wk is 
dominated by the narrow left portion, as seen by the sep- 
arate window functions for the left and right portion (the 
"ringing" in Fourier space for the left portion is due to the 
sharpness of edges). The window function clearly acts as a 
low-pass filter: all high-frequency fluctuations are strongly 
damped, since the emission spectrum E{u) is so broad. 21cm 
emission corresponds to x ~ 10 -4 /:r max ~ 10 -2 ' 5 ; at these 
small frequency intervals, Wk ~ 10 -4 . Thus, even if there 
are pathologically large fluctuations in the electron distribu- 
tion function A 7 = [&N('y)/N(pf)\ lms ~ 0.1 (averaged over 
bins of width (S7/7 ~ 5v/v ~ 10~ 4 ), they will be strongly 
suppressed to yield temperature fluctuations on these scales 
ST/T ~ W k A 1 ~ 10~ 5 (A 7 /0.1), smaller than the 21cm sig- 
nal ST/T ~ 10 -4 . In reality, stochastic fluctuations in the 
electron distribution function are likely to be much smaller 
than A 7 ~ 0.1: Poisson fluctuations are utterly negligible, 
since one is averaging over a huge number of electrons, so 
some unusual process (for instance, resonances in momen- 
tum space) would be necessary to produce sharp features 
in the electron momentum distribution. Thus, foreground 
temperature fluctuations over the narrow frequency inter- 
vals associated with 21cm emission are utterly negligible. 

We can perform a similar estimate for thermal 
bremsstrahlung emission, which is the emission spectrum 
E{y, v) of an electron of velocity v, convolved with a 
Maxwellian velocity distribution. However, the emission 
spectrum of even a mono-energetic electron population is 
very broad in frequency space: E(u, v) oc v~ 1 gff(v, v), where 
the Gaunt factor gff(v,u) is of order unity from low frequen- 
cies out to hv CT it ~ hmv . The free- free emission spectrum 
for a monoenergetic distribution is much broader than that 
for synchrotron emission, where Av ~ ^ob s : for free- free 
emission, Av ~ u clit ~ (k B T/h) ~ 100(T/10 4 K) THz > 
fobs ~ 150MHz. The window function will decay even more 
rapidly at small frequency intervals, and there will be no 
small-scale frequency structure for free-free emission, even 
for a turbulent plasma. 

Since extragalactic sources are dominated by free-free 
or synchrotron emission, their sum will also be smooth in 
frequency space. The only contaminants which might have 
sufficient small-scale structure in frequency space to be con- 
fused with the 21cm forest are radio recombination lines. 
Preliminary estimates show they are unlikely to be a sig- 
nificant source of contamination out of the Galactic plane 
jShaver et all 1 19991 : lohfc Mackll2003l ). but there could be 
surprises. Below, we estimate the integrated background of 
radio recombination lines from star-forming galaxies. 




APPENDIX B: THE RADIO RECOMBINATION 
LINE BACKGROUND 

In this Appendix, we perform simple estimates of the fore- 
ground contamination due to extragalactic radio recombi- 



Figure Al. Top panel: Frequency dependence of the synchrotron 
spectrum of a single electron with Lorentz factor 7, as given by 
equation dA5l l, in units of 1 = u>/u) c . The emission peaks at 
£max = 0.29; the low frequency portion ('L' for Left) damps more 
rapidly than the high frequency portion ('R' for Right). Bottom 
panel: The window function Wk, which is the Fourier transform 
of the emission spectrum F(x), as a function of Xk = 1/k. For 
the small frequency intervals relevant for the 21cm forest Xk < 
10 — 2 ' 5 , the window function has small amplitude, Wk ~ 10 -4 , 
implying that foreground temperature fluctuations are negligible 
on these scales. The decay of the window function is dominated by 
the narrow left portion of the emission spectrum (dashed lines), 
rather than the broad right portion (dotted lines). 



nation lines (RRLs), which have frequencies: 

v = 153 An (t^) 3 (1 +z) _1 MHz. (Bl) 

Extragalactic RRLs were first detected from the star- 
bursts in M82 a nd N GC 253 by IShaver et all (|l977l ) and 
ISeaouist fc Belli (|l977T l. It was soon pointed out that stimu- 
lated emission from a strong non-thermal background could 
allow di stant radio ga laxies and quasars to be seen in RRL 
emission IShaverl(|l978l ). This expectation has not been borne 
out, likely because the volu me filling factor of HII region s 
around radio quasars is small (|Anantharamaiah et alj[l993l ) . 
Nonetheless, they are detectable in bright nuclear starburst 
regions in nearby galaxies; t o date, there ar e 15 k nown ex- 
tragalactic RRL detections (|Rov et al.ll2008l . l20ld i. Models 
of RRL emission are highly uncertain and sensitive to the 
unknown gas density and geometry in nuclear regions. At 
higher frequencies, emission appears to be due to a mixture 
of spontaneous emission and stimulated emission by free-free 
continuum within the HII regions, while at lower frequencies 
stimulated emission by non-thermal continuum can be im- 
portant. In this paper, we adopt a crude conservative upper 
bound on RRL emission. We find the level of contamination 
is so small that more sophisticated estimates are unneces- 
sary. 

We begin by estimating the luminosity due to internal 
emission (spontaneous and stimulated) in HII regions. Under 
optically thin conditions, and considering only the strongest 
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a lines ( An = 1), the flux from an HII region in R RLs is 
l|Shaverl l| 19781 ); also see lGordon fc Sorochenkol (|2002T )): 



AS L 



0.72^ n e Mmi 



AVT t 



3/2 D 2 



6«(1 - PnTc/2) mJy 



where n e (in cm -3 ) is the gas density, AZhii (in Mq) is the 
mass of ionized hydrogen, AV (in kms" 1 ) is the line width, 
T e (in K) is the temperature, v (in GHz) is the frequency, 
and 6 n (l — /3 n rc/2) captures departures from thermal equi- 
librium 1 2 ! . Th e dependence on n is weak as long asn> 1; see 



IShaverilfl978l l for the full expression. Assuming that the es- 
cape fraction of ionizing photons is small, ionization equilib- 
rium in HII regions within the galaxy implies that even 2 V = 
a B n(M H n/m p ) = N ion = 10 53 s" 1 (SFR/1M Q yr" 1 ), which 
allows us to related the quantity n e Mmi in equation (|B2|) 
with the star formation rate (SFR) of a galaxy. Performing 
this substitution, we obtain the RRL luminosity for inter- 
nally generated radiation: 



L" 1 



1.3 x 10 25 ergs" 1 Hz" 



SFR 



1M Q yr 



AV 



200^18-! I \10 A K 



-3/2 



Mi 



GHz 

PnTc/2)' 



(B3) 



100 



Note that we have adopted an extremely optimistic ex- 
pression for the boost factor due to stimulated emission, 

Mi-tc/V 2 ) ~ 10 °- 

Let us proceed to estimate the RRL luminosity due to 
stimulated emission from external radiation. Consider a col- 
lection of compact HII regions around a starburst. One can 
place a conservative upper bound on stimulated emission 
by considering an extreme case where N HII regions lie in 
front of a uniform ly distributed background emission S'cbg 
jZhao et alJll996h : 



ASl = A^mi 



AVh 

AVoi 



Sc 



! (e- 



(B4) 



where A?i os is the number of HII regions along the line of sight 
(generally Ni os < 1 observationally, and is required in this 
formula since we ignore shadowing), AVhii ~ 20 kms" 1 is 
the Doppler width of individual HII regions, AVobs is the ob- 
served line width (either due to motions of HII regions within 
the galaxy, or the width of the observing channel) , and tl , re 
are the line and continuum (free-free) optical depths of each 
HII region. When AV b s = AVhii, and for tl, tq <IC 1 (which 
is usually the case), ASl ~ — -^Hn^^cbg) which makes in- 
tuitive sense: the line-to-continuum ratio is simply the cu- 
mulative optical depth of the line (note that for stimulated 
emission, tl < 0). The factor of AVhii/ AVobs simply ex- 
presses frequency dilution. In this paper, we shall conser- 
vatively assume a very high fidicial value of N^tl ~ 0.1. 
While models of the observed emission from starbursts gen- 



erally yields A/gfiTL ~ 10" 



10" 4 at 8.3 GHz, r L oc v~ 



12 In particular, tq is the continuum (free-free) optical depth, 
while b n = N n /N* is the departure coefficient which relates the 
population of atomic energy level n, N n , to its LTE value TV*. 
The central line optical depth is then related to the LTE value 
by tl — fe n /3n-r£, where /3„ = 1 - (fcT e )/ (hu)(AbAn)/b n - Under 
conditions appropriate for strong stimulated emission, typically 
b a p n ~ 10 - 100. 



and hence could be larger at lower frequencies. Note that 
the ratio of continuum to line optical depths is: 



(B2) IH 

TL 



AVh 



20 kms- 



(l50MHz) 



100 



2. 

10 4 



(B5) 

Thus, increasing tl to values of order unity will also in- 
crease free-free absorption, resulting in unobservab le RRL 
emissi o n. Note also that the single "s la b" model of IShaveij 
|l97Sl) ; lAnantharamaiah et all (fl993h : IZhao et all (|l996r ) 
simply corresponds to N^fi — > 1, such that 5l w TL^cBg- 

We now need a model for S'cbg- At low frequencies, non- 
thermal emission dominates over free-free emission, and will 
be the primary source of stimulated emission. Since the cov- 
ering fraction of HII regions has been genera lly found to be 
small (e.g., in lAnantharamaiah et ahl (|1993), /hii ~ 10 -6 ), 
stimulated emission from an AGN is generally unimportant. 
We model the non-thermal emission to be primarily syn- 
chrotron emission from supernova remnants. Th e observed 
corre lation between SFR and radio luminosity is l|Yun et al.l 
l200ll) : 



Z^ bg = 2.2 x 10 28 ergs" 1 Hz" 1 



SFR 



IMoyr- 1 J VI GHz 



(B6) 



Then from equation IB4I we have: 



^external = 3 3 x 1Q 26 ergg -l ^-1 , V \ 8 ( ^N ml 



SFR 



1( U/J \ (tl 
AVkn \ / AVobs \ ' [ I + : 



IMoyr- 1 J V 20 kms- 1 J VSOOkms" 1 

We stress that these fiducial parameters (6„(1 — rc/3n/2) ~ 
100, tlA^hh ~ 0.1) are by design significant overesti- 
mates to conservatively place an upper bound on RRL 
emission. For instance, from pu blished values of_SFR~ 
270 M yr" 1 l|Shiova et ahl |200lD ~ ^ M^yr" 1 l|Boissierl 
l2005h . ~ 6M Q yr _1 (lYoung et al.lll988h and m easured line 
widths of 320, 95, 200 kms" 1 (|Zhao et aljfl996l ) for Arp 220, 
M83, and NGC 2146 respectively, our model predicts line 
flux densities of S L ~ 4.0, 44, 4.7 mJy for the H92a line at 8.3 
GHz, compared to observed values of Sl ~ 0.4, 0.8, 0.36mJy 
l|Zhao et al.1 Il996l ). For the H165a and H167a lines at 
1.4 GHz, we predict a flux of ~ 5mJy, while an up- 
per bound on the peak line density is < 0.25 mJy (3cr; 
lAnantharamaiah et al.l (2000)). Thus, as is our intent, we 
consistently overestimate the RRL flux by at least an order 
of magnitude (we do not advocate using these parameters 
to estimate the detectability of RRLs!). 

It is clear that since Z/j 11 oc v, while L° xt oc v~ ' 8 ', they 
dominate at high and low frequencies respectively. In our 
model, the cross-over point is at a rest frame frequency of 
v ~ 4.8GHz. Since we are concerned with RRLs at rest frame 
frequencies well below 1.4 GHz, the RRL emission stimu- 
lated by external non-thermal sources strongly dominates. 
In terms of parameters appropriate for 21cm experiments, 
the RRL luminosity at low frequencies is then: 



L external A r» . i n25 —1 tt —1 / 

v — 4.2 x 10 ergs Hz I 



fobs 



150 MHz 



( SFR 



AVk 



VlMeyr" 1 J VaOkms- 1 J \IMR 



Av 



l + z 



where we have substituted Av for AVobs; in general the line 
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will always be unresolved by the detector unless bandwidths 
are extremely small, Av ~ 0.1 MHz. 

We can now estimate the typical brightness temperature 
perturbation due to an integrated background of RRL emit- 
ters. Assuming a comoving star formation rate of esfr, and 
the Rayleigh Jeans approximation XL = c 2 Sl/(2 fei/ 2 6 2 ), 
where 9 is the beamsize, we obtain: 

f^ 1 .8x 1 0-K( 01Mo ^ Mpc _, ) (B9) 

Vl50MHz/ V20kms-V VlMHz/ V 3 / 

where z refers to the redshift of the RRL emitting galax- 
ies, and not of 21cm emission. Note that this is indepen- 
dent of the bandwidth Av or the beamsize 9, since XL 
has units of surface brightness per frequency interval (in- 
creasing Av, 9 will increase the RRL luminosity of a re- 
gion, but it will be spread out over larger angular and 
frequency intervals). However, the large comoving volume 
Vcom ~ 12 3 (Az//^ obs /(l/15O))(0/5') 2 Mpc 3 in a 21cm exper- 
iment's field of view is important in estimating rms temper- 
ature fluctuations <5Xl. Since rms density fluctuations are 
well below unity on these scales (erg ~ 0.8 at z — 0), an 
enormous bias factor b > 10 3 is required for fluctuations in 
the comoving star formation rate <5esFR to produce an obser- 
vationally relevant cosmological RRL signal, <5Xl ~ 10 _2 K. 
Such large bias is obviously unrealistic. 

Thus, it appears fairly robustly that the RRL back- 
ground is unlikely to be important. Physically, we can under- 
stand why this is the case. We modelled L RRL ~ T L N^fjL^ bs 
where r^N^fj ~ 0.1, so naively one might think that the 
RRL foreground is ~ 10% of the radio continuum foreground 
(which we know significantly exceeds the 21cm signal). How- 
ever, whereas all continuum sources along a line of sight con- 
tribute, for a specific recombination line, RRL emitters from 
only a tiny redshift slice contribute^- Thus, the very reason 
why RRLs might pose a danger — the fact that they are lo- 
calized in frequency space — is also the reason for their small 
amplitude, since that implies localization in redshift space. 
Our only note of caution is that observations of extragalac- 
tic RRLs have taken place at signific antly higher frequencies 
(e.g., at 1.4, 8.1, 84, 96 and 207 GHz; lAnantharamaiah et al.l 
(2000)); emission at significantly lower frequencies may have 
different physics. In particular, since tl oc lower fre- 

quency RRL emission is sensitive to lower density HII re- 
gions. However, it is difficult to see how one can vitiate the 
conservative upper bound we have placed. The matter could 
be quickly put to rest by measuring the RRL intensities of 
radio galaxies and starbursts at these frequencies. 



3 It is possible that RRLs at different frequencies and thus red- 
shifts can contribute to a fixed bandpass, but they will add in- 
coherently and the signal will be small. The RRL foreground is 
really only a danger if emission from a given line will be large, 
since only a small fraction of starbursts exhibit stimulated emis- 
sion. 



